49 research outputs found

    An Evolutionary Algorithm for Discovering Multi-Relational Association Rules in the Semantic Web

    Get PDF
    International audienceIn the Semantic Web context, OWL ontologies represent the conceptualization of domains of interest while the corresponding assertional knowledge is given by RDF data referring to them. Because of its open, distributed, and collaborative nature, such knowledge can be incomplete, noisy, and sometimes inconsistent. By exploiting the evidence coming from the assertional data, we aim at discovering hidden knowledge patterns in the form of multi-relational association rules while taking advantage of the intensional knowledge available in ontological knowledge bases. An evolutionary search method applied to populated ontological knowledge bases is proposed for finding rules with a high inductive power. The proposed method, EDMAR, uses problem-aware genetic operators, echoing the refinement operators of ILP, and takes the intensional knowledge into account, which allows it to restrict and guide the search. Discovered rules are coded in SWRL, and as such they can be straightforwardly integrated within the ontology, thus enriching its expressive power and augmenting the assertional knowledge that can be derived. Additionally , discovered rules may also suggest new axioms to be added to the ontology. We performed experiments on publicly available ontologies, validating the performances of our approach and comparing them with the main state-of-the-art systems

    Meta-interpretive learning of higher-order dyadic datalog: predicate invention revisited

    Get PDF
    Since the late 1990s predicate invention has been under-explored within inductive logic programming due to difficulties in formulating efficient search mechanisms. However, a recent paper demonstrated that both predicate invention and the learning of recursion can be efficiently implemented for regular and context-free grammars, by way of metalogical substitutions with respect to a modified Prolog meta-interpreter which acts as the learning engine. New predicate symbols are introduced as constants representing existentially quantified higher-order variables. The approach demonstrates that predicate invention can be treated as a form of higher-order logical reasoning. In this paper we generalise the approach of meta-interpretive learning (MIL) to that of learning higher-order dyadic datalog programs. We show that with an infinite signature the higher-order dyadic datalog class H2 2 has universal Turing expressivity though H2 2 is decidable given a finite signature. Additionally we show that Knuth–Bendix ordering of the hypothesis space together with logarithmic clause bounding allows our MIL implementation MetagolD to PAC-learn minimal cardinality H2 2 definitions. This result is consistent with our experiments which indicate that MetagolD efficiently learns compact H2 2 definitions involving predicate invention for learning robotic strategies, the East–West train challenge and NELL. Additionally higher-order concepts were learned in the NELL language learning domain. The Metagol code and datasets described in this paper have been made publicly available on a website to allow reproduction of results in this paper

    Human-machine scientific discovery

    Get PDF
    International audienceHumanity is facing existential, societal challenges related to food security, ecosystem conservation, antimicrobial resistance, etc, and Artificial Intelligence (AI) is already playing an important role in tackling these new challenges. Most current AI approaches are limited when it comes to ‘knowledge transfer’ with humans, i.e. it is difficult to incorporate existing human knowledge and also the output knowledge is not human comprehensible. In this chapter we demonstrate how a combination of comprehensible machine learning, text-mining and domain knowledge could enhance human-machine collaboration for the purpose of automated scientific discovery where humans and computers jointly develop and evaluate scientific theories. As a case study, we describe a combination of logic-based machine learning (which included human-encoded ecological background knowledge) and text-mining from scientific publications (to verify machine-learned hypotheses) for the purpose of automated discovery of ecological interaction networks (food-webs) to detect change in agricultural ecosystems using the Farm Scale Evaluations (FSEs) of genetically modified herbicide-tolerant (GMHT) crops dataset. The results included novel food-web hypotheses, some confirmed by subsequent experimental studies (e.g. DNA analysis) and published in scientific journals. These machine-leaned food-webs were also used as the basis of a recent study revealing resilience of agro-ecosystems to changes in farming management using GMHT crops

    Next-Generation Global Biomonitoring: Large-scale, Automated Reconstruction of Ecological Networks

    Get PDF
    We foresee a new global-scale, ecological approach to biomonitoring emerging within the next decade that can detect ecosystem change accurately, cheaply, and generically. Next-generation sequencing of DNA sampled from the Earth's environments would provide data for the relative abundance of operational taxonomic units or ecological functions. Machine-learning methods would then be used to reconstruct the ecological networks of interactions implicit in the raw NGS data. Ultimately, we envision the development of autonomous samplers that would sample nucleic acids and upload NGS sequence data to the cloud for network reconstruction. Large numbers of these samplers, in a global array, would allow sensitive automated biomonitoring of the Earth's major ecosystems at high spatial and temporal resolution, revolutionising our understanding of ecosystem change

    Automated Discovery of Food Webs from Ecological Data Using Logic-Based Machine Learning

    Get PDF
    Networks of trophic links (food webs) are used to describe and understand mechanistic routes for translocation of energy (biomass) between species. However, a relatively low proportion of ecosystems have been studied using food web approaches due to difficulties in making observations on large numbers of species. In this paper we demonstrate that Machine Learning of food webs, using a logic-based approach called A/ILP, can generate plausible and testable food webs from field sample data. Our example data come from a national-scale Vortis suction sampling of invertebrates from arable fields in Great Britain. We found that 45 invertebrate species or taxa, representing approximately 25% of the sample and about 74% of the invertebrate individuals included in the learning, were hypothesized to be linked. As might be expected, detritivore Collembola were consistently the most important prey. Generalist and omnivorous carabid beetles were hypothesized to be the dominant predators of the system. We were, however, surprised by the importance of carabid larvae suggested by the machine learning as predators of a wide variety of prey. High probability links were hypothesized for widespread, potentially destabilizing, intra-guild predation; predictions that could be experimentally tested. Many of the high probability links in the model have already been observed or suggested for this system, supporting our contention that A/ILP learning can produce plausible food webs from sample data, independent of our preconceptions about “who eats whom.” Well-characterised links in the literature correspond with links ascribed with high probability through A/ILP. We believe that this very general Machine Learning approach has great power and could be used to extend and test our current theories of agricultural ecosystem dynamics and function. In particular, we believe it could be used to support the development of a wider theory of ecosystem responses to environmental change
    corecore